An Approach for Hierarchical System Level Diagnosis of Massively Parallel Computers Combined with a Simulation-Based Method for Dependability Analysis

Authors

  • Jörn Altmann
  • Frank Balbach
  • Axel Hein
Abstract

The primary focus in the analysis of massively parallel supercomputers has traditionally been on their performance. However, their complex network topologies, large numbers of processors, and sophisticated system software can make them very unreliable. If every failure of one of the many components of a massively parallel computer could shut down the machine, the machine would be useless. Therefore, fault tolerance is required. The basis of effective mechanisms for fault tolerance is efficient diagnosis. This paper deals with concurrent and hierarchical system-level diagnosis for a particular massively parallel architecture and with a simulation-based method to validate the proposed diagnosis algorithm. The diagnosis algorithm is presented, and we describe a simulation-based method to test and verify the algorithms for fault tolerance as early as the design phase of the target machine.

Similar resources

A Simple Approach to Static Analysis of Tall Buildings with a Combined Tube-in-tube and Outrigger-belt Truss System Subjected to Lateral Loading

In this paper, an efficient technique is presented for static analysis of tall buildings with combined tube-in-tube and outrigger-belt truss system while considering shear lag effects. In the process of replacing the discrete structure with an elastically equivalent continuous one, the structure is modeled as two parallel cantilevered flexural-shear beams that are constrained at the outrigger-b...

COMPOSITION OF ISOGEOMETRIC ANALYSIS WITH LEVEL SET METHOD FOR STRUCTURAL TOPOLOGY OPTIMIZATION

In the present paper, an approach is proposed for structural topology optimization based on a combination of the Radial Basis Function (RBF) Level Set Method (LSM) with Isogeometric Analysis (IGA). The corresponding combined algorithm is detailed. First, in this approach, the discrete problem is formulated in the Isogeometric Analysis framework. The objective function based on compliance of particular lo...

Analysis of Emergency Department Queue System Performance: Simulation Approach Based on Experiment Design

Background: Simulation is an appropriate technique for analyzing and evaluating the dynamic behavior of complex systems. The present study aimed to develop an integrated model, using a simulation approach based on experiment design, to analyze the performance of the admission queue system for patients referred to the emergency department of the Modarres hospital. Methods: In this descriptive...

An approach to fault detection and correction in design of systems using of Turbo codes

We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one example of such codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...

EEG Artifact Removal System for Depression Using a Hybrid Denoising Approach

Introduction: Clinicians use several computer-aided diagnostic systems for depression to authorize their diagnosis. An electroencephalogram (EEG) may be used as an objective tool for early diagnosis of depression and for preventing it from reaching a severe and permanent state. However, artifact contamination reduces the accuracy of EEG signal processing systems. Methods: This work proposes a no...

Publication date: 1994